DETAILED ACTION

Claim Objections 

1. Claim 80 is objected to because of the following informalities:

Claim 80 is directed to uploading the additional control signal, which comprises 
replacing, supplementing, or updating an existing control signal, where "the additional 
control signal" lacks express antecedent basis from independent claim 59, upon which 
claim 80 depends. Moreover, Applicants have cancelled all of the corresponding claims 
directed to replacing, supplementing, or updating an existing control signal, so that it is 
questioned whether the failure to cancel claim 80 was an oversight on the part of 
Applicants. 

Appropriate correction is required. 

Claim Rejections - 35 USC § 103 

2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

3. Claims 59, 62 to 64, 66, 69 to 71, 73, 76 to 78, 81, and 83 are rejected under 35
U.S.C. 103(a) as being unpatentable over Basore et al. in view of Besling et al. 

Concerning independent claims 59, 81, and 83, Basore et al. discloses a voice
activated device and method for providing access to remotely retrieved data, 
comprising: 

"a transceiver configured to receive input from the device via a communications 
network" - a digital signal in the form of a digital transmission from handset 110 ("the 
device") is received by voice activated device 120 (column 2, lines 49 to 50: Figure 1); a 
response text is sent from text-to-speech unit 129 of voice activated device 120 and is 
communicated to a user of handset 110 (column 5, lines 16 to 20: Figure 2: Step 250); 
thus, the voice activated device 120 includes "a transceiver" to receive input from a user 
of handset 110, and to send a response to a user of handset 110; broadly, a digital 
transmission sent wirelessly from a cordless handset 110 to voice activated device 120
is an element of "a communications network"; 

"a memory configured to store an acoustic model of the input" - microprocessor 
124 of voice activated device 120 has an associated memory unit 125; memory unit 125 
comprises a phonetic acoustic models database 126; the phonetic acoustic models 
database stores a plurality of models of how phonemes are spoken (column 2, lines 60 
to 65: Figure 1); the acoustic models are designed to recognize the input spoken by a 
user at handset 110 ("of the input");

"a processing module coupled to the transceiver and configured to perform 
speech recognition on the received input based on a previously stored acoustic model 
in order to recognize a command" - voice activated device 120 includes a speech 
recognition unit 128; speech recognition unit 128 is connected to microprocessor 124, 
or may be a software program suitably running on microprocessor 124 (column 3, lines
10 to 17: Figure 1); a handset 110 issues a user's voice commands, and the voice 
commands are recognized by speech recognition unit 128 from phonetic acoustic 
models stored ("a previously stored acoustic model") in acoustic models database 126
of voice activated device 120 (column 4, lines 49 to column 5, line 3: Figure 2); 

"wherein the transceiver is further configured to transmit data to the device 
responsive to the command, to enable the device to provide the data in an output 
response" - once a command spoken or issued by the user is recognized, 
microprocessor retrieves an appropriate response from application data stored in 
memory unit 125; the response is communicated to the user; in a preferred 
embodiment, the response text is sent to text-to-speech unit 129 and transformed into 
an acoustic response (column 5, lines 8 to 22: Figure 2: Step 250); alternatively, 
response text may be communicated to a user by a printer 170 or display screen 180 
(column 5, lines 40 to 49: Figure 1).

Concerning independent claims 59, 81, and 83, the only significant element not 
expressly disclosed by Basore et al. is "wherein the acoustic model of the input and the 
previously stored acoustic model are associated with the device to address the specific 
characteristics of additional input received from the device." Arguably, too, Basore et al. 
omits "a communication network" between handset 110 and voice activated device 120 
because any information is only communicated wirelessly to a cordless handset. 
Basore et al. discloses that the phonetic acoustic models are designed for a specific 
microphone so as to increase the speech recognition accuracy in voice activated device 
120. (Column 2, Lines 38 to 43) Moreover, Basore et al. is designed to receive
additional application data and acoustic spellings from remote central office, which are 
stored in dictionary 127. (Column 3, Line 66 to Column 4, Line 36: Figure 2: Steps 200, 
210, and 220) Arguably, then, Basore et al. discloses that an acoustic model will
"address specific characteristics of additional input" because an acoustic model stored 
in memory unit 125 is designed for a specific microphone. Thus, any previously stored 
acoustic model is "associated with the device" because it addresses issues of the input 
being affected by a particular microphone that is being used by handset 110. 

Concerning independent claims 59, 81, and 83, however, even if Basore et al.
does not squarely disclose an "acoustic model of the input" and "the previously stored 
acoustic model" for "additional input", it is known in the prior art of speech recognition to 
train and store acoustic models for any new users. Specifically, Besling et al. teaches 
user model-improvement-data-driven selection and update of a user-oriented 
recognition model for speech recognition. Based on acoustic training data, a suitable 
acoustic model is selected or a basic acoustic model is adapted using a suitable 
adaptation profile. Each of the models is, preferably, targeted towards a specific type of 
speech, such as male/female speech, slow speech/fast speech, or speech with different 
accents. An acoustic model that gives the best results is then selected. (Column 5, 
Lines 3 to 14) Model improvement data comprises acoustic training data. A default 
acoustic model is initially selected, and then an acoustic model suitable for a user is 
selected from a plurality of different acoustic models. User station 350 may also extract 
certain acoustic characteristics, and select a best matching model based on the 
characteristics. (Column 7, Line 66 to Column 8, Line 45) Furthermore, Besling et al.
teaches that at least one user station 350, 360, 370 is connected via communication 
means 312, 352 to a server station 310 by network 330, which may be any suitable 
network, such as a local area network, wide area network, or the Internet. (Column 5, 
Line 63 to Column 4, Line 19: Figure 3) An objective is to enable speech recognition in 
a client-server configuration, without undue training burden on a user, where a server is 
capable of simultaneously supporting recognition for many clients. (Column 4, Lines 14 
to 22) It would have been obvious to one having ordinary skill in the art to include an 
acoustic model of input from a user and previously stored acoustic models to address 
specific characteristics of additional input received as taught by Besling et al. in a voice 
activated device of Basore et al. for a purpose of supporting recognition for many clients 
in a client-server speech recognition configuration. 

Concerning independent claims 66 and 73, Basore et al. discloses a voice 
activated device and method for providing access to remotely retrieved data, 
comprising: 

"receiving an audio input from a device over a network, the audio input based on 
speech input" - a digital signal in the form of a digital transmission from handset 110
("the device") is received by voice activated device 120 (column 2, lines 49 to 50: Figure 
1); a digital signal corresponds to a user's analog voice signal from microphone 112 that 
is converted by A/D converter 114 (column 2, lines 36 to 48: Figure 1); broadly, a digital 
transmission sent wirelessly from a cordless handset 110 to voice activated device 120
is an element of "a network"; 

"storing an acoustic model of the audio input" - memory unit 125 comprises a 
phonetic acoustic models database 126; the phonetic acoustic models database stores 
a plurality of models of how phonemes are spoken (column 2, lines 60 to 65: Figure 1); 
implicitly, the acoustic models are designed to recognize the input spoken by a user at 
handset 110 ("of the audio input"); 

"performing speech recognition on the received audio input based on a 
previously stored acoustic model in order to recognize a command" - voice activated 
device 120 includes a speech recognition unit 128; speech recognition unit 128 is 
connected to microprocessor 124, or may be a software program suitably running on 
microprocessor 124 (column 3, lines 10 to 17: Figure 1); a handset 110 issues a user's 
voice commands, and the voice commands are recognized by speech recognition unit 
128 from phonetic acoustic models stored ("based on a previously stored acoustic
model") in acoustic models database 126 of voice activated device 120 (column 4, lines 
49 to column 5, line 3: Figure 2); 

"transmitting data to the device over the network, responsive to the command, to 
enable the device to provide the data in an output response" - once a command spoken 
or issued by the user is recognized, microprocessor retrieves an appropriate response 
from application data stored in memory unit 125; the response is communicated to the 
user via handset 110; in a preferred embodiment, the response text is sent to
text-to-speech unit 129 and transformed into an acoustic response (column 5, lines 8 to 22:
Figure 2: Step 250); alternatively, response text may be communicated to a user by a
printer 170 or display screen 180 (column 5, lines 40 to 49: Figure 1). 

Concerning independent claims 66 and 73, the only significant element not 
expressly disclosed by Basore et al. is "wherein the acoustic model of the input and the 
previously stored acoustic model are associated with the device to address the specific 
characteristics of additional input received from the device." Arguably, too, Basore et al. 
omits "a communication network" between handset 110 and voice activated device 120 
because any information is only communicated wirelessly to a cordless handset. 
Basore et al. discloses that the phonetic acoustic models are designed for a specific 
microphone so as to increase the speech recognition accuracy in voice activated device 
120. (Column 2, Lines 38 to 43) Moreover, Basore et al. is designed to receive 
additional application data and acoustic spellings from remote central office, which are 
stored in dictionary 127. (Column 3, Line 66 to Column 4, Line 36: Figure 2: Steps 200, 
210, and 220) Arguably, then, Basore et al. discloses that an acoustic model will 
"address specific characteristics of additional input" because an acoustic model stored 
in memory unit 125 is designed for a specific microphone. Thus, any previously stored 
acoustic model is "associated with the device" because it addresses issues of the input 
being affected by a particular microphone that is being used by handset 110. 

Concerning independent claims 66 and 73, however, even if Basore et al. does 
not squarely disclose an "acoustic model of the input" and "the previously stored 
acoustic model" for "additional input", it is known in the prior art of speech recognition to 
train and store acoustic models for any new users. Specifically, Besling et al. teaches
user model-improvement-data-driven selection and update of a user-oriented
recognition model for speech recognition. Based on acoustic training data, a suitable 
acoustic model is selected or a basic acoustic model is adapted using a suitable 
adaptation profile. Each of the models is, preferably, targeted towards a specific type of 
speech, such as male/female speech, slow speech/fast speech, or speech with different 
accents. An acoustic model that gives the best results is then selected. (Column 5, 
Lines 3 to 14) Model improvement data comprises acoustic training data. A default 
acoustic model is initially selected, and then an acoustic model suitable for a user is 
selected from a plurality of different acoustic models. User station 350 may also extract 
certain acoustic characteristics, and select a best matching model based on the 
characteristics. (Column 7, Line 66 to Column 8, Line 45) Furthermore, Besling et al.
teaches that at least one user station 350, 360, 370 is connected via communication 
means 312, 352 to a server station 310 by network 330, which may be any suitable 
network, such as a local area network, wide area network, or the Internet. (Column 5, 
Line 63 to Column 4, Line 19: Figure 3) An objective is to enable speech recognition in 
a client-server configuration, without undue training burden on a user, where a server is 
capable of simultaneously supporting recognition for many clients. (Column 4, Lines 14 
to 22) It would have been obvious to one having ordinary skill in the art to include an 
acoustic model of input from a user and previously stored acoustic models to address 
specific characteristics of additional input received as taught by Besling et al. in a voice 
activated device of Basore et al. for a purpose of supporting recognition for many clients 
in a client-server speech recognition configuration. 

Concerning claims 62 to 63, 69 to 70, and 76 to 77, Basore et al. discloses that
once a command spoken or issued by the user is recognized, microprocessor retrieves 
an appropriate response from application data stored in memory unit 125; the response 
is communicated to the user via handset 110; in a preferred embodiment, the response 
text is sent to text-to-speech unit 129 and transformed into an acoustic response 
(column 5, lines 8 to 22: Figure 2: Step 250); alternatively, response text may be 
communicated to a user by a printer 170 or display screen 180 (column 5, lines 40 to 
49: Figure 1); thus, data sent by voice activated device 120 to handset 110 can include
either "audio data" provided by text-to-speech unit or "a text message" for a display 
screen or printer. 

Concerning claims 64, 71, and 78, Basore et al. discloses that handset 110 does
not include a facility for speech recognition or retrieving a response; only voice activated 
device includes speech recognition unit 128 and text-to-speech unit 129, and only voice 
activated device can communicate a response to a user's command (column 3, lines 10 
to 15: Figure 1; column 5, lines 8 to 22: Figure 2: Steps 240 and 250); thus, handset
110 ("the device") is not capable of processing the input voice command. 

4. Claims 61, 65, 68, 72, 75, and 79 to 80 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Basore et al. in view of Besling et al. as applied to claims 59, 
66, and 73 above, and further in view of Houser et al. 

Concerning claims 61, 68, and 75, Basore et al. discloses an application of voice
commands for retrieving information about what programs are on television. (Column 5, 
Lines 8 to 40: Figure 3) However, Basore et al. does not provide video data as data 
that is a response to a user command, but only audio and text responses. However, it 
is known for speech commands to access information in a variety of forms. Specifically, 
Houser et al. teaches a speech interface for controlling a device such as a television 
and for controlling access to broadcast information such as video, audio, and/or text 
information in accordance with recognized utterances of a user. (Abstract) An objective 
is to afford ease of use as well as permitting the implementation of tasks which are not 
easily implemented using menu screens and key presses. (Column 2, Lines 23 to 29) 
It would have been obvious to one having ordinary skill in the art to provide video data 
to a user in response to a user's voice command as taught by Houser et al. from a voice 
activated device for requesting information about what's on television of Basore et al. for 
a purpose of permitting implementation of tasks which would be difficult to perform 
using menu screens and key presses. 

Concerning claims 65, 72, and 79, Houser et al. discloses that information is
retrieved from an information distribution center 12 in response to commands from 
terminal unit 16 for accessing information transmitted by information distribution center 
12 (column 5, line 39 to column 6, line 14: Figure 1); additionally, electronic 
programming guide (EPG) data is accessed from an information provider 114-3, 
including television schedule information arranged by time and channel, and transmitted 
to subscriber units (column 22, lines 19 to 51: Figure 2C). Similarly, Basore et al.
discloses that voice activated device 120 retrieves application data from central office
160 upon request or demand from a user. (Column 3, Lines 60 to 13: Figure 2: Steps 
200, 210, and 220) 

Concerning claim 80, Houser et al. discloses that second vocabulary data may
be downloaded from head-end installation 125, where the second vocabulary data 
permits a user to use spoken controls to implement basic television control, as well as 
control of VCR 162-2 and access to EPG data; second vocabulary permits a user to use 
spoken controls to implement basic television control, EPG control, VCR control, and 
event programming (column 23, lines 38 to 50: Figure 2C); the second vocabulary 
includes the vocabulary of Table I above and additional vocabulary of Table II below 
(column 24, lines 1 to 34); thus, the second vocabulary provides controls for a VCR and 
an EPG that are at least "supplementing" to the first vocabulary of Table I. 



Response to Arguments 

5. Applicants' arguments filed 04 September 2009 have been considered but are 
moot in view of the new grounds of rejection, necessitated by amendment. 

Applicants have presented substantially new subject matter in the amended 
claims, requiring further search and consideration. Thus, Applicants' arguments are 
moot. However, it is noted that Applicants have elected not to pursue the subject matter 
suggested in the telephone interview of 01 September 2009. During that interview, it 
was suggested that Applicants amend the claims to incorporate subject matter directed 
to the disclosure from ¶[0031] of United States Patent Publication 2002/0072918,
corresponding to the current application. Instead, Applicants have elected to amend the
claims to incorporate subject matter directed to the disclosure from ¶[0088] of
corresponding United States Patent Publication 2002/0072918. Of course, Applicants 
are free to pursue whatever subject matter they wish towards patentability. Still, it is 
maintained that the currently amended claims 59, 62 to 64, 66, 69 to 71, 73, 76 to 78,
81, and 83 are obvious under 35 U.S.C. § 103(a) as being unpatentable over Basore et
al. in view of Besling et al., and claims 61, 65, 68, 72, 75, and 79 to 80 are obvious
under 35 U.S.C. § 103(a) as being unpatentable over Basore et al. in view of Besling et
al., and further in view of Houser et al. 



Conclusion 

6. The prior art made of record and not relied upon is considered pertinent to 
Applicants' disclosure. 

Lapere, Sherwood et al., Balakrishnan et al., Kahn et al., Wong, Kanevsky et al. 
(700), and Scruggs et al. disclose related art. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Martin Lerner whose telephone number is (571) 272- 
7608. The examiner can normally be reached on 8:30 AM to 6:00 PM Monday to 
Thursday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David R. Hudspeth can be reached on (571) 272-7843. The fax phone 
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/Martin Lerner/ 
Primary Examiner 
Art Unit 2626 
September 25, 2009 



